Chapter 19
Getting the Most Out of HTML with CGI
The Common Gateway Interface (CGI) is a standard that governs
how external applications are interfaced with Web servers. The
reasoning behind the invention of CGI is simple: without it, the
HTTP specification and all Web servers would have become a patchwork
of ad hoc extensions.
CGI provides a way to write programs that will run on the server
when they are invoked by the client Web browser through HTML code.
These programs can be written in the C language, but C is just
one possibility. For a discussion of other options, see the section
later titled "Choosing a CGI Programming Language."
At this point, the astute reader might have noticed that there
are no fewer than four areas of programming prowess needed to
get this dog to hunt: CGI, HTTP, HTML, and C (or some other programming
language). And just for good measure, you might want to throw
in the Win32 API and SQL, depending on what your Web program will
do after you finish laying the necessary foundation.
And as if that is not enough, you'll want to consider writing
your CGI program using the newer ISAPI (Internet Services Application
Programming Interface) standard for better performance. The fundamentals
of ISAPI are very similar to CGI, except that ISAPI programs are
compiled as DLLs rather than EXEs, and they use pointers to memory
blocks instead of stdin/stdout. This book does not substantially
cover ISAPI; however, everything you learn about CGI can be applied
to ISAPI.
The reason you will want to run this challenging gauntlet is that
CGI and ISAPI open the door to great new opportunities. CGI/ISAPI
programs are often associated with Web forms. When the user finishes
filling out an HTML form and submits it, the data stream that
is returned to the server is called the form data. Keep
in mind that just because you send a blank HTML form to the client
Web browser, nothing is going to happen with the form data when
it is submitted unless you, the Webmaster, make it happen. The
form data would just land in the bit bucket if not for CGI or
ISAPI.
CGI/ISAPI is a necessity if you want to save the form data into
a database on the server, for example. Or perhaps the form data
should be e-mailed to the Webmaster or some other party. Maybe
the intent of the form is to have some data faxed or e-mailed
back to the client. Or the form could be used to obtain a database
query from the user, which is then sent to a database engine before
the formatted results are finally returned to the client as an
HTML file. These are just some of the possibilities available
to anyone brave enough to master the details of client/server
Web programming with CGI/ISAPI. (There are tools that make it
possible to do much of this without programming; I will tell you
about several of them later.) Although all these things can be
accomplished with traditional programming, doing it on the Web
makes applications platform-independent, distributed, easier to
develop, and easier to update.
The purpose of this chapter is to give you the fundamentals of
CGI, and show you two simple CGI examples and one sophisticated
and practical example. Because all the source code is on the CD-ROM,
programming knowledge is not required, but let's not kid ourselves: it
would be very helpful. If you don't yet know about programming,
you might just want to skim this chapter to get a glimpse of the
possibilities. On the other hand, if you want to utilize CGI on
your Intranet, this chapter will be a guiding light.
Note |
CGI programs are also called CGI scripts or applications. The reason they are called scripts is that they can be written in Perl, or at the command shell, in which case they are interpreted rather than compiled. When C or Visual Basic is used for CGI, the terms CGI program or CGI application are preferred to the term CGI script because those languages are not interpreted in the traditional sense of script files. Even shorter, some people just refer to all such things as CGIs.
CGI scripts are not to be confused with a new product from Microsoft named Visual Basic Script or a new product from Sun and Netscape named JavaScript. Neither JavaScript nor VBScript is necessarily associated with the Common Gateway Interface.
|
Figure 19.1 shows a high-level overview of how CGI forms-processing
works. There are many other details of HTTP and TCP/IP than what
are shown here, but I omit those in order to concentrate on the
basic concepts of CGI.
Figure 19.1 : How CGI processes Web forms.
The annotated steps corresponding to Figure 19.1 follow. (I assume
you are familiar with the way an HTML file gets created and displayed
in the Web browser, which is the point at which step 1 begins.)
- After the user has entered the form data in the Web browser,
he chooses the Submit button, which is coded between the <FORM>
and </FORM> tags in
the HTML file. The Submit button is a link to a CGI (or ISAPI)
application on the server. For more information about HTML forms,
please review Chapter 5, "What You
Need to Know About HTML."
- The browser uses the POST
method of the HTTP protocol to send the form data to the server.
The GET method could also
be used, but POST is preferred
for form data.
- The data travels through the Intranet or the Internet and
arrives at the server, which then passes the data to the CGI application.
- In addition to parsing the form data and processing it as
desired, the CGI application must write the HTML response that
will be sent back to the client. The CGI specification says that
the Web server should read the stdout device of the CGI application.
- The server adds appropriate HTTP header information and sends
the output of the CGI application back through the network as
an HTML response file, which the Web browser receives in memory.
- The browser interprets the HTML code and displays the results
on-screen for the user. At a minimum, this file should usually
contain some notification that the data was processed by the server,
followed by a hyperlink to take the user back to the HTML page
he was on before choosing the link to the form page. In other
words, the file puts the client back where he was before he came
to Step 1 of this list.
Caution |
Allowing any person with a Web browser to execute applications on your server is a security concern. Ensure that all the CGI applications are isolated to one directory and that no one else has access to that directory. With Microsoft IIS, all CGI applications are kept in a directory called scripts under the server root (by default). Also, be careful about using public-domain CGI applications that have not been tested over time to be secure.
|
Nearly all Web servers conform to the CGI 1.1 standard, which
is a protocol agreement between your application and the Web server.
With most Web servers, CGI applications must be console-mode programs
located within the HTTP data directory tree. By saying console-mode,
I mean that CGI applications cannot be Windows API programs or
GUI programs. However, Microsoft IIS is one of several HTTP servers
that takes advantage of ISAPI. ISAPI permits you to write Windows
DLLs for your CGI applications, and therefore you do have access
to the full Win32 API, including ODBC functionality.
Of course, it is very unlikely that you would want to write a
GUI CGI or ISAPI application, because that would imply that you
(as the Webmaster) were going to sit at the Web server waiting
to interact with every client that sent data to the server. Remember,
the client never sees the CGI or ISAPI program; the client will only
see the HTML output of the program. Nearly all CGI and ISAPI applications
process the form data as background tasks because there could
be hundreds of transactions per minute (depending on how popular
your Web server becomes).
Another major advantage of ISAPI over CGI is performance. ISAPI
passes memory blocks between the server and the application. CGI
relies on launching a new program to process the form data from
every client, and it uses environment variables and the stdin and
stdout streams (or, with some Windows servers, temporary disk files)
to pass data back and forth.
Choosing a CGI Programming Language
In UNIX, which is where the Web got its start, CGI applications
are frequently written in C, Perl, or the UNIX shell command language.
In Windows NT, you can use C/C++ or Perl with most servers and
Visual Basic with some. (Well, you can use Visual Basic with any
CGI Web server. I'll show you how in the next chapter.) Many Windows
NT Webmasters run a public-domain Perl 4 interpreter for CGI and
Web site statistics. Perl 5, which includes some nice object-oriented
extensions, has recently arrived on the scene, and you should
definitely give it consideration as a CGI tool on your Intranet.
(See Chapter 29 for more information about
Perl.)
Both Perl and C have their advocacy camps. Perl offers great file
and string handling, and the code is fairly easy to write and
modify. On the other hand, because C is a compiled language, it
offers better efficiency, both from the optimization of the compiled
code and the fact that the interpreter is not launched for every
client submission of form data. In addition, many claim that compiled
programs provide better security than scripts because hackers
can more easily modify the text of a script just before its execution.
In this chapter, I use some DOS command language, some C/C++,
and of course, some HTML. The first example uses a very simple
DOS batch file. The second example is written in C. The third
example, presented in the next chapter, is a practical application
that shows you how to put C++ and Visual Basic together to build
an HTML form that can save data into an ODBC database on the server.
It's okay if you don't plan to learn programming. Most of the
examples are already compiled on the CD-ROM and will run without
your knowing how to program. In this chapter and the next, I have
used Visual C++ 4.0 and Visual Basic 4.0.
The server uses environment variables to pass information to the
CGI application. The environment variables are set after the HTTP
GET or POST
request is received by the Web server (see the next section) and
before the server executes the CGI application. Most environment
variables are fairly standard from server to server, but be aware
that some differences exist. Nothing stops the vendor of a Web
server from adding nonstandard environment variables for use by
their customers.
The CGI standard specifies certain environment variables that
are used for conveying information to a CGI script. The following
subset of those environment variables is supported by most HTTP
servers. If this list seems confusing, don't despair; most CGI
programs don't need to use all these environment variables:
- CONTENT_LENGTH: The length of the content as given by the client.
- CONTENT_TYPE: For queries that have attached information, such as POST and PUT, this is the content type of the data.
- GATEWAY_INTERFACE: The revision of the CGI specification to which this server complies. The format for this variable is CGI/revision.
- HTTP_ACCEPT: The MIME types that the client will accept. The format for this variable is type/subtype.
- PATH_INFO: The extra path information, as given by the client. This variable enables scripts to be accessed by their virtual pathname.
- QUERY_STRING: The information that follows the ? in the URL that referenced this script. This is the query information.
- REMOTE_ADDR: The IP address of the remote host making the request.
- REQUEST_METHOD: The method with which the request was made, such as GET, HEAD, and POST.
- SCRIPT_NAME: A virtual path to the script being executed.
- SERVER_NAME: The server's hostname, DNS alias, or IP address.
- SERVER_PORT: The port number to which the request was sent.
- SERVER_PROTOCOL: The name and revision of the information protocol this request came in with. The format for this variable is protocol/revision.
- SERVER_SOFTWARE: The name and version of the server software answering the request. The format for this variable is name/version.
Other HTTP headers received from the client are available in environment
variables of the form HTTP_*.
For instance, the User-Agent header value is available in HTTP_USER_AGENT.
Note that because a dash is not legal in environment variable names
on many systems, the header name is converted to uppercase and each -
(dash) is replaced by _ (underscore) in the
corresponding environment variable name. An understanding of
the HTTP specification is probably a prerequisite to a full comprehension
of the purpose of some of these environment variables.
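The naming rule for the HTTP_* variables can be sketched in a few lines of C. The function name header_to_env is hypothetical and is not part of the CD-ROM code; it is only a minimal illustration of the rule described above (prefix with HTTP_, convert to uppercase, turn each dash into an underscore).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build the CGI variable name for an HTTP request header, for example
 * "User-Agent" -> "HTTP_USER_AGENT". Returns 0 on success, or -1 if
 * the output buffer is too small. (header_to_env is a hypothetical
 * helper written for this illustration.) */
int header_to_env(const char *header, char *out, size_t outlen)
{
    static const char prefix[] = "HTTP_";
    size_t plen = sizeof(prefix) - 1;
    size_t hlen = strlen(header);
    size_t i;

    if (outlen < plen + hlen + 1)
        return -1;

    memcpy(out, prefix, plen);
    for (i = 0; i < hlen; i++) {
        unsigned char c = (unsigned char)header[i];
        out[plen + i] = (c == '-') ? '_' : (char)toupper(c);
    }
    out[plen + hlen] = '\0';
    return 0;
}
```

A CGI program could pass the resulting name to getenv to look up any header the browser sent.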
The CGI application accesses information about how it was invoked
through the environment variables initialized by the Web server;
it reads any information supplied by the client (in a POST
request) through stdin and sends output to the client through
stdout. This process is pretty simple to understand, once you
get the hang of it. (Isn't that how everything works?)
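Because getenv returns NULL for any variable the server did not set, it can help to wrap the lookups. The following is only a sketch; the names cgi_env and parse_content_length are assumptions made for this example, not functions from the CD-ROM.

```c
#include <stdlib.h>

/* Return the value of an environment variable, or fallback when the
 * server did not set it. (Hypothetical helper.) */
const char *cgi_env(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL) ? value : fallback;
}

/* Convert the text of CONTENT_LENGTH to a non-negative number.
 * Returns 0 when the text is missing, malformed, or negative. */
long parse_content_length(const char *text)
{
    char *end;
    long n;

    if (text == NULL)
        return 0;
    n = strtol(text, &end, 10);
    if (end == text || *end != '\0' || n < 0)
        return 0;
    return n;
}
```

A program would typically call parse_content_length(cgi_env("CONTENT_LENGTH", "0")) before reading anything from stdin.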
GET Versus POST
GET and POST
are two HTTP methods of sending form data to the Web server. When
you write a form in HTML, you should specify which HTTP method
the browser will use when the form data is sent back to the server.
Listing 19.1 is a short block of HTML code that comprises a complete
form. The line numbers are not a part of the HTML code. Note in
line 2 that the form is using Method="POST".
You could just as easily change this line to "GET".
The main difference between GET
and POST is that the CGI
application will receive the POST
data by reading the stdin device, whereas GET
data would be received on the command line and in the QUERY_STRING
environment variable.
Listing 19.1. A short and sweet HTML form.
1. <HTML><HEAD><TITLE>Simple Form</TITLE></HEAD><BODY>
2. <FORM Method="POST"
3. Action="http://domain/cgi-bin/prog.exe">
4. Your Name: <INPUT Name="user" SIZE="30"><P>
5. <INPUT Type=submit Value="Click here to send">
6. </FORM></BODY></HTML>
Usually, your forms will be much more complex than the one in
Listing 19.1, which only contains one input field. Because many
operating systems impose some limit on the length of the command
line, it is usually best to use POST.
On the other hand, if you know your form data is small, you can
use GET.
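A CGI program that wants to accept either method can decide where to look for the form data by checking REQUEST_METHOD. This is a minimal sketch; the FormSource type and the form_source function are names invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Where the form data will be found. (Hypothetical type.) */
typedef enum {
    FORM_FROM_NONE,          /* unknown or missing method   */
    FORM_FROM_QUERY_STRING,  /* GET: read QUERY_STRING      */
    FORM_FROM_STDIN          /* POST: read the stdin device */
} FormSource;

/* Map the value of REQUEST_METHOD to the place the data arrives. */
FormSource form_source(const char *request_method)
{
    if (request_method == NULL)
        return FORM_FROM_NONE;
    if (strcmp(request_method, "GET") == 0)
        return FORM_FROM_QUERY_STRING;
    if (strcmp(request_method, "POST") == 0)
        return FORM_FROM_STDIN;
    return FORM_FROM_NONE;
}
```

The caller would pass getenv("REQUEST_METHOD") and branch on the result.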
CGI Command Lines
In the case of a GET request
(or ISINDEX), the form data
will be on the command line and in the QUERY_STRING
environment variable. The command line will contain a question
mark after the application name as the delimiter that marks the
beginning of the form data. Suppose you change the HTML code in
Listing 19.1 to use Method="GET",
and the user types in the string User's
Name in the text field named user.
The command line of the CGI application would look like the following:
\cgi-bin\prog.exe?user=User%27s+Name
The QUERY_STRING environment
variable would look like the following:
user=User%27s+Name
Your first observation is naturally going to be that this stuff
looks somewhat strange. Your second observation is, hopefully,
that the QUERY_STRING data
appears somewhat more friendly looking than the command line data.
To figure out what's going on with all those funny characters,
recall from line 4 of Listing 19.1 that the input field was named
user. Now that label is being
sent back to you as the first word of QUERY_STRING.
Everything after the equals sign in the QUERY_STRING
represents the data that the user typed into that particular field.
Because more than one field could be used, each one must be named
uniquely in the HTML form and in the QUERY_STRING
data that is sent back to the CGI application.
Remember that the example assumes that the user typed User's
Name with no period on the end. (If he had typed a
period, that would be another story; more about that later.) Checking
the preceding QUERY_STRING,
notice that you have almost exactly what the user typed,
except for the %27, which
replaces the apostrophe, and the plus sign, which replaces the
space character. These translations are needed because certain
characters are reserved in URLs. Form data sent with GET travels
as part of the URL, so reserved characters in the data must be
escaped to keep them from being mistaken for URL syntax.
The percent sign is a hex escape character, and the two digits
that follow it give the hexadecimal ASCII code of the escaped
character. The apostrophe has a hexadecimal code of 27. If
a period were escaped, it would appear as %2E.
Not all browsers escape the same set of characters; the period,
for example, is legal in a URL and is usually sent unchanged.
The plus sign is simply the convention for encoding space characters.
A different translation, which applies to HTTP header names rather
than to form data, is the dash replaced by an underscore in the
HTTP_* environment variable names.
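To make the encoding concrete, here is a small sketch of an encoder in C. The function form_encode is hypothetical, and the choice to escape every byte that is not a letter or digit is an assumption of the sketch; it is stricter than most browsers, but any decoder will accept the result.

```c
#include <ctype.h>
#include <stddef.h>

/* Encode src as form data: letters and digits pass through, a space
 * becomes '+', and every other byte becomes %XX. Returns 0 on
 * success, or -1 if out is too small. (Hypothetical helper.) */
int form_encode(const char *src, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t used = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (isalnum(c) || c == ' ') {
            /* One output byte, plus room for the terminator. */
            if (used + 1 >= outlen)
                return -1;
            out[used++] = (c == ' ') ? '+' : (char)c;
        } else {
            /* Three output bytes, plus room for the terminator. */
            if (used + 3 >= outlen)
                return -1;
            out[used++] = '%';
            out[used++] = hex[c >> 4];
            out[used++] = hex[c & 0x0F];
        }
    }
    if (used >= outlen)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Encoding the string User's Name with this function yields User%27s+Name, the same value shown in the QUERY_STRING example.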
Finally, if there were other input fields in the HTML form, they
would follow the data of the user field. Each name=value
pair would be separated by an ampersand (&)
character.
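Putting the rules together, a program can pull one field out of the form data by scanning the ampersand-separated pairs and decoding the value. The sketch below is not the CD-ROM code; form_value and its helpers are hypothetical names, and it assumes the field names themselves contain no escaped characters.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit; the caller has already checked isxdigit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Decode len bytes of URL-encoded text into out (always terminated). */
static void decode_part(const char *src, size_t len, char *out, size_t outlen)
{
    size_t i = 0, used = 0;

    if (outlen == 0)
        return;
    while (i < len && used + 1 < outlen) {
        if (src[i] == '+') {
            out[used++] = ' ';
            i++;
        } else if (src[i] == '%' && i + 2 < len &&
                   isxdigit((unsigned char)src[i + 1]) &&
                   isxdigit((unsigned char)src[i + 2])) {
            out[used++] = (char)(16 * hex_value(src[i + 1]) + hex_value(src[i + 2]));
            i += 3;
        } else {
            out[used++] = src[i++];
        }
    }
    out[used] = '\0';
}

/* Find the field called name in data ("a=1&b=2") and copy its decoded
 * value into out. Returns 1 if the field was found, 0 otherwise. */
int form_value(const char *data, const char *name, char *out, size_t outlen)
{
    size_t name_len = strlen(name);
    const char *pair = data;

    while (*pair != '\0') {
        const char *end = strchr(pair, '&');
        size_t pair_len = (end != NULL) ? (size_t)(end - pair) : strlen(pair);

        if (pair_len > name_len &&
            strncmp(pair, name, name_len) == 0 && pair[name_len] == '=') {
            decode_part(pair + name_len + 1, pair_len - name_len - 1, out, outlen);
            return 1;
        }
        if (end == NULL)
            break;
        pair = end + 1;
    }
    return 0;
}
```

The input string is never modified, so the same data can be searched for several field names in turn.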
Summary of Seven Funny Characters
Table 19.1 is a quick review of the special characters you will
come across in CGI. Some of these conventions make up what is
known as URL-encoding.
Table 19.1. Special characters in CGI.
Special Character | Description
+ (plus sign) | Used in place of space characters in user input.
= (equals sign) | Used to separate the field name from the field value.
? (question mark) | Used to mark the beginning of the form data on the command line.
_ (underscore) | Used to replace dash characters in HTTP header names when they become HTTP_* environment variable names.
% (percent sign) | Used to encode reserved ASCII characters, followed by two hex digits.
& (ampersand) | Used as the boundary between name/value pairs for each field in the HTML form.
# (number sign) | Used in URLs to indicate a section within an HTML document, sort of like a bookmark. This character is not strictly related to CGI; it can be used in any URL to an HTML document that contains an <A> tag with a Name attribute (called an anchor).
Reading from stdin
Recall that QUERY_STRING
is not used for the POST
method. Because POST is probably
more typical, you need to understand how to read stdin to retrieve
form data. (This is another reason why CGI programs are console-mode
rather than GUI; GUI programs don't have a concept of stdin.)
First, the server will set the CONTENT_LENGTH
environment variable to tell how many bytes to read from stdin.
You must not read more than that amount. Then the POST-invoked
program will read and parse the form data from the stdin device
instead of the QUERY_STRING
environment variable.
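The rule about never reading past CONTENT_LENGTH can be sketched as follows. The function read_form_data is a hypothetical helper; it takes the stream as a parameter so that it can be tried out on an ordinary file as well as on stdin.

```c
#include <stdio.h>
#include <stddef.h>

/* Read at most content_length bytes of form data from in, never more
 * than fits in buf, and terminate the result. Returns the number of
 * bytes stored. (Hypothetical helper.) */
size_t read_form_data(FILE *in, long content_length, char *buf, size_t bufsize)
{
    size_t want, got;

    if (bufsize == 0)
        return 0;
    if (content_length < 0)
        content_length = 0;

    want = (size_t)content_length;
    if (want > bufsize - 1)
        want = bufsize - 1;     /* leave room for the terminator */

    got = fread(buf, 1, want, in);
    buf[got] = '\0';
    return got;
}
```

In a real CGI program the first argument would be stdin and the second would come from the CONTENT_LENGTH environment variable.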
Whether you use POST or GET,
have some standard routines in C or Perl to help you perform standard
decoding. The C programs in this chapter include several useful
functions for that purpose. Feel free to customize them and use
them in your own programs. They are public-domain.
Writing to stdout
When the CGI application is done parsing and processing the input
data, it must send a reply to the server. The server will forward
the reply to the client after applying a header as per the rules
of the HyperText Transfer Protocol.
The server will be listening to the stdout device of the CGI application
while the latter is executing. The CGI program can either generate
HTML code on-the-fly or refer the server to another document that it
would like to have sent instead; that document can be reachable
through HTTP, FTP, or Gopher anywhere on the Web. See the following
section titled "A CGI Example in HTML and C" for all
the details about composing an HTML response document from within
the CGI application.
If you want the server to send another document that already exists,
you can use the Location
header. In C, you would execute a printf
statement that looks something like the following:
printf("Location: ftp://FQDN/dir/filename.txt\n\n");
Because you must follow the header information with a blank line,
the example has two newline characters.
Tip |
It is very important that your CGI program prints out an extra blank line after the HTTP header and before the contents of the document that follows the header. A missing blank line is a common source of trouble when trying to debug CGI systems.
|
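One way to make the blank line hard to forget is to put the redirect in a small function of its own. This is only a sketch; send_location is a hypothetical name, and rejecting URLs that contain line breaks is a precaution added for the example, not something the CGI specification requires of the program.

```c
#include <stdio.h>
#include <string.h>

/* Write a Location header followed by the required blank line.
 * Returns 0 on success, or -1 if the URL is empty, contains a line
 * break (which would corrupt the header), or cannot be written. */
int send_location(FILE *out, const char *url)
{
    if (url == NULL || url[0] == '\0' || strpbrk(url, "\r\n") != NULL)
        return -1;
    return (fprintf(out, "Location: %s\n\n", url) < 0) ? -1 : 0;
}
```

A CGI program would call send_location(stdout, "ftp://FQDN/dir/filename.txt") and then exit without printing anything else.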
How to Learn More About CGI
|
The granddaddy of all CGI information centers on the Internet is NCSA, the National Center for Supercomputing Applications at the University of Illinois. Full details of how to write CGI scripts are given in the CGI specification, which can be found online at http://hoohoo.ncsa.uiuc.edu/cgi/. You will find that NCSA has CGI material at all levels from beginning to advanced, as well as a CGI test suite where you can try the programs and see the code. At the time I am writing this, Version 1.1 is the latest CGI specification. It is not available as a single document, but consists of several hyperlinked pages maintained at NCSA.
For further information about CGI, check out these other resources:
- One of the best CGI and HTML documents available anywhere on the Internet is written by Michael Grobe at the University of Kansas. You'll find "An Instantaneous Introduction to CGI Scripts and HTML Forms" at the following URL:
http://kufacts.cc.ukans.edu/info/forms/forms-intro.html
- For an introduction to HTML forms and CGI, see this URL (case-sensitive):
http://www.utirc.utoronto.ca/HTMLdocs/NewHTML/htmlindex.html
- David Robinson has written an independent and detailed version of the CGI specification. Unlike the NCSA specification, his version exists as a single document (which makes it much easier to print), and it gives a description of all CGI environment variables. See this URL:
http://www.ast.cam.ac.uk/%7Edrtr/
- Whether you want to post a question about a CGI roadblock you need help with or just pick up tips by reading the threads of others, the CGI newsgroup is definitely the place to be:
comp.infosystems.www.authoring.cgi.
- Last but not least, don't forget to visit www.yahoo.com. Select Computers/WWW/CGI and browse the many resources available.
|
There is a standard MIME type for plain ASCII text, text/plain, which
is sent in the header line Content-type: text/plain. This MIME type is useful in a trivial
but interesting example of CGI, which is often used as proof that
the Web server and CGI are installed and running properly. The
idea is to invoke a DOS batch file that echoes the values of the
CGI environment variables on the server back to the Web browser.
All you need to do is save the following text into a file named
trivial.cmd (or copy it from
the CD-ROM) in your cgi-bin
or scripts directory (as
configured in the Web server):
@echo off
echo content-type: text/plain
echo.
set
The set statement in this
simple program prints the values of all the environment variables, including those set by the Web server for CGI.
The output is directed back to the Web browser. Now just write
a line in your home page that links to trivial.cmd
or create a new HTML file such as the following one, which is
named trivial.htm on the
CD-ROM:
<HTML>
<HEAD>
<TITLE>Trivial CGI Test</TITLE>
</HEAD>
<BODY>
<form action="trivial.cmd" method="POST">
<H2>Press the button to run the trivial CGI test.</H2>
<input type="submit" value="Go">
</FORM>
</BODY>
</HTML>
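If your server will not run batch files as CGI, roughly the same test can be written in C. The sketch below is an assumption about how you might do it, not a file from the CD-ROM; dump_cgi_env is a hypothetical name, and it reports only the variables listed earlier in this chapter rather than the whole environment.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write a plain-text report of the standard CGI variables to out. */
void dump_cgi_env(FILE *out)
{
    static const char *names[] = {
        "CONTENT_LENGTH", "CONTENT_TYPE", "GATEWAY_INTERFACE",
        "HTTP_ACCEPT", "PATH_INFO", "QUERY_STRING", "REMOTE_ADDR",
        "REQUEST_METHOD", "SCRIPT_NAME", "SERVER_NAME", "SERVER_PORT",
        "SERVER_PROTOCOL", "SERVER_SOFTWARE"
    };
    size_t i;

    /* Header line, then the blank line that ends the header. */
    fputs("Content-type: text/plain\n\n", out);

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        const char *value = getenv(names[i]);
        fprintf(out, "%s=%s\n", names[i], value != NULL ? value : "(not set)");
    }
}
```

A main function that does nothing but call dump_cgi_env(stdout), compiled as a console-mode program, would serve the same purpose as the batch file.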
A CGI Example in HTML and C
This section covers a complete CGI transaction, from server to
client, back to server, and back to client. This example serves
as a template from which you could build a more sophisticated
CGI application. The CGI system you are going to build here starts
with an HTML file that contains a form. When the user submits
the form data, the server will determine that the Action
attribute for the form refers to a CGI application. The server
will start the application and send it the form data on stdin.
Then the server will listen for stdout from the CGI application.
The CGI program is written in C. The program will show you how
to retrieve the form data, parse it, and send back an HTML document.
The HTML response is constructed within the CGI application because
you should embed part of the form data in your response. You don't
always have to create HTML on-the-fly from inside the CGI program,
but doing so will make your Web pages more dynamic.
The Data Entry Form
To demonstrate CGI, you need to start with an HTML page that contains
a URL pointing to a CGI application. Figure 19.2 shows how the
data entry form appears in the Web browser as the user is filling
it out.
Figure 19.2 : The data entry form that the user fills out.
Listing 19.2 shows the HTML code that gets the ball rolling with
the sample CGI program. This file and the following C program
are available on the CD-ROM if you want to experiment. Note that
you will want to change the URL in the FORM
ACTION variable to refer to your site (or use localhost).
Listing 19.2. The HTML code that creates the form.
<HTML>
<HEAD>
<TITLE>CGI Application Example</TITLE>
</HEAD>
<BODY>
<H1>CGI Application Example</H1>
<hr>
This is an example of a simple CGI application
handling the data from an HTML form.
<BR>
<FORM ACTION="http://www.hqz.com/scripts/cgisamp.exe" METHOD="Post">
Please enter your name: <INPUT NAME="name" TYPE="text"><p>
<input type=submit value="When done, click here!">
</FORM>
</BODY>
</HTML>
The C Code
Before getting to the C program that will process the form data,
consider the output of the C program. Listing 19.3 is the HTML
code that is sent back to the client after the server obtains
it from stdout of the CGI application.
Listing 19.3. The HTML code that is written to stdout by
cgisamp.c.
<HEAD><TITLE>Submitted OK</TITLE></HEAD>
<BODY><h2>The information you supplied has been accepted.
<br> Thank You Scott</h2>
<h3><A href="http://www.hqz.com/cgisamp.htm">
[Return]</a></h3></BODY>
Figure 19.3 shows the browser on the client side after the CGI
application has finished processing the form data. Note in Figure
19.3 (and Listing 19.3) that the HTML response sent by the CGI
application is customized for each set of form data; it includes
the name that the user supplied.
Figure 19.3 : The result of the CGI application as seen by the client.
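Because the response echoes text that the user typed, a careful program escapes the characters that have special meaning in HTML before printing them. The following sketch shows one way to do it; html_escape is a hypothetical helper and is not part of cgisamp.c.

```c
#include <stddef.h>
#include <string.h>

/* Copy src to out, replacing &, <, >, and " with HTML entities.
 * Returns 0 on success, or -1 if out is too small. */
int html_escape(const char *src, char *out, size_t outlen)
{
    size_t used = 0;

    for (; *src != '\0'; src++) {
        char single[2];
        const char *rep;
        size_t rep_len;

        switch (*src) {
        case '&': rep = "&amp;";  break;
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '"': rep = "&quot;"; break;
        default:
            single[0] = *src;
            single[1] = '\0';
            rep = single;
            break;
        }

        rep_len = strlen(rep);
        if (used + rep_len >= outlen)   /* keep room for terminator */
            return -1;
        memcpy(out + used, rep, rep_len);
        used += rep_len;
    }
    if (used >= outlen)
        return -1;
    out[used] = '\0';
    return 0;
}
```

With a helper like this, a name typed as <b>Scott</b> is displayed literally instead of being interpreted as markup by the browser.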
Listing 19.4 shows the complete C program, called cgisamp.exe,
which is executed by the server when the client submits the form
data. The following is a quick list of the five functions in cgisamp:
- strcvrt: Converts all occurrences of one character to another within a given string.
- TwoHex2Int: Called when a percent character marks an escape code; converts the two hex digits that follow into an integer.
- urlDecode: Expands all the escape codes by calling TwoHex2Int.
- StoreField: Retrieves field/value pairs from the form data.
- main: Reads the form data from stdin and writes the HTML response to stdout.
Listing 19.4. The CGI application written in C language (cgisamp.c
on the CD-ROM).
/***************************************************************************
 * File: cgisamp.c
 *
 * Use: CGI Example Script.
 *
 * Notes: Assumes it is invoked from a form and that REQUEST_METHOD is POST.
 *        Ensure that you compile this script as a console mode app.
 *
 * This script is a modified version of the script that comes with EMWAC
 * HTTPS.
 *
 * Date: 8/21/95
 * Christopher L. T. Brown clbrown@netcom.com
 *
 ***************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

char InputBuffer[4096];
static char *field = "";  /* Name of the last field parsed.  */
static char *name = "";   /* Value of the last field parsed. */

/* Convert all cOld characters   */
/* in cStr into cNew characters. */
void strcvrt(char *cStr, char cOld, char cNew)
{
    int i = 0;

    while(cStr[i])
    {
        if(cStr[i] == cOld)
            cStr[i] = cNew;
        i++;
    }
}

/* The string starts with two hex */
/* characters. Return an integer  */
/* formed from them.              */
static int TwoHex2Int(char *pC)
{
    int Hi, Lo, Result;

    Hi = pC[0];
    if('0' <= Hi && Hi <= '9')
        Hi -= '0';
    else if('a' <= Hi && Hi <= 'f')
        Hi -= ('a' - 10);
    else if('A' <= Hi && Hi <= 'F')
        Hi -= ('A' - 10);

    Lo = pC[1];
    if('0' <= Lo && Lo <= '9')
        Lo -= '0';
    else if('a' <= Lo && Lo <= 'f')
        Lo -= ('a' - 10);
    else if('A' <= Lo && Lo <= 'F')
        Lo -= ('A' - 10);

    Result = Lo + 16 * Hi;
    return(Result);
}

/* Decode the given string in-place */
/* by expanding %XX escapes.        */
void urlDecode(char *p)
{
    char *pD = p;

    while(*p)
    {
        /* Escape: the next 2 chars are the hex       */
        /* representation of the actual character.    */
        if(*p == '%' && isxdigit((unsigned char)p[1])
                     && isxdigit((unsigned char)p[2]))
        {
            *pD++ = (char)TwoHex2Int(p + 1);
            p += 3;
        }
        else
            *pD++ = *p++;   /* Ordinary char, or a stray '%'. */
    }
    *pD = '\0';
}

/* Parse out and store field=value items.          */
/* Don't use strtok here! main is in the middle    */
/* of its own strtok scan.                         */
void StoreField(char *Item)
{
    char *p;

    p = strchr(Item, '=');
    if(p == NULL)           /* Not a field=value pair; skip it. */
        return;
    *p++ = '\0';

    /* Get rid of those nasty +'s before decoding, so */
    /* that an escaped plus sign (%2B) stays a '+'.   */
    strcvrt(Item, '+', ' ');
    strcvrt(p, '+', ' ');
    urlDecode(Item);
    urlDecode(p);
    strcvrt(p, '\n', ' ');

    field = Item;   /* Hold on to the field just in case. */
    name = p;       /* Hold on to the name to print. */
}

int main(void)
{
    int ContentLength, x, i;
    char *p,
         *pRequestMethod,
         *URL;

    /* Turn buffering off for stdin. */
    setvbuf(stdin, NULL, _IONBF, 0);

    /* Tell the client what we're going to send. */
    printf("Content-type: text/html\n\n");

    /* What method were we invoked through? */
    pRequestMethod = getenv("REQUEST_METHOD");

    /* Get the data from the client according to the requested method. */
    if(pRequestMethod != NULL && strcmp(pRequestMethod, "POST") == 0)
    {
        /* Read in the data from the client. */
        p = getenv("CONTENT_LENGTH");
        if(p != NULL)
            ContentLength = atoi(p);
        else
            ContentLength = 0;
        if(ContentLength < 0)
            ContentLength = 0;
        if(ContentLength > (int)sizeof(InputBuffer) - 1)
            ContentLength = (int)sizeof(InputBuffer) - 1;

        i = 0;
        while(i < ContentLength)
        {
            x = fgetc(stdin);
            if(x == EOF)
                break;
            InputBuffer[i++] = (char)x;
        }
        InputBuffer[i] = '\0';
        ContentLength = i;

        p = getenv("CONTENT_TYPE");
        if(p == NULL)
            return(0);
        if(strcmp(p, "application/x-www-form-urlencoded") == 0)
        {
            p = strtok(InputBuffer, "&");   /* Parse the data */
            while(p != NULL)
            {
                StoreField(p);
                p = strtok(NULL, "&");
            }
        }
    }

    URL = getenv("HTTP_REFERER");   /* What url called me. */
    if(URL == NULL)
        URL = "";

    /* Note: name comes straight from the user. A production  */
    /* program should escape HTML special characters first.   */
    printf("<HEAD><TITLE>Submitted OK</TITLE></HEAD>\n");
    printf("<BODY><h2>The information you supplied has been accepted.");
    printf("<br> Thank You %s</h2>\n", name);
    printf("<h3><A href=\"%s\">[Return]</a></h3></BODY>\n", URL);
    return(0);
}
Notice the calls in the main
routine to the C library function getenv.
That is how the program can determine if the REQUEST_METHOD
is equal to POST and how
many bytes it should read by checking CONTENT_LENGTH.
Another very important point to make about the main
function is that it must output a partial HTTP header to go with
the HTML document that it creates. This line appears near the
top of the function:
printf("Content-type: text/html\n\n");
You might want to add error handling later, in which case you
would probably create an alternative HTML response document. The
HTTP header would need to be printed in any case. The CGI convention
requires that the header be followed by a blank line before the
HTML code that is sent. That is why the printf
statement includes two newlines at the end. Please forgive my
frequent reminders, but this point is important.
The content type indicates a MIME encoding that tells the client
browser that the data stream to follow is HTML code in ASCII format.
There are several standard MIME encoding types. See the CGI specification
for further information.
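If you do add error handling, one option is to route every response, success or failure, through a single function so that the header and the blank line are never left out. This is a minimal sketch under that assumption; send_html_page is a hypothetical name, and it expects the title and message to be text supplied by the program itself.

```c
#include <stdio.h>

/* Write a complete CGI response: the partial HTTP header, the blank
 * line that must follow it, and then a small HTML document. The
 * title and message are assumed to be safe, program-supplied text. */
void send_html_page(FILE *out, const char *title, const char *message)
{
    fputs("Content-type: text/html\n\n", out);
    fprintf(out, "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n", title);
    fprintf(out, "<BODY><H2>%s</H2></BODY></HTML>\n", message);
}
```

An error path could then call send_html_page(stdout, "Error", "Please enter your name.") and return.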
Getting CGI systems to work properly obviously requires the ability
to integrate several sophisticated tools. And what should a good
software engineer do when faced with the challenge of building
a complex system? One proven approach is to establish clear milestones
to reach the overall goal, build the software one piece at a time
(preferably as black boxes with as few interfaces as possible),
and test each module separately as you go to prove that the milestones
are met successfully.
For example, test the HTML form independently from the CGI program.
You might even take the time to build a test environment for the
CGI application so that you can verify its input/output completely
independent of any interaction with the Web server. Doing this
could yield a great payback when it comes time to debug or enhance
the system, especially if it is a large application or if it interfaces
with a database. The goal is to reduce the edit/compile/link/test
cycle down to as tight a loop as possible. A test environment
that doesn't involve running the server, launching the browser,
and filling out the form will yield significant time savings over
the long run.
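One way to make such a test environment possible is to keep the input routine independent of stdin and the environment. The sketch below assumes a hypothetical helper named read_form_data that takes the input stream and the CONTENT_LENGTH string as parameters; it is an illustration, not the code used elsewhere in this chapter.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads exactly the number of bytes named by
   the CONTENT_LENGTH string from 'in'. Returns a malloc'd,
   NUL-terminated buffer that the caller must free, or NULL on
   any failure. */
char *read_form_data(FILE *in, const char *content_length)
{
    long len;
    char *buf;

    if (in == NULL || content_length == NULL)
        return NULL;

    len = strtol(content_length, NULL, 10);
    if (len <= 0)
        return NULL;

    buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;

    if (fread(buf, 1, (size_t)len, in) != (size_t)len) {
        free(buf);          /* short read: the stream ended early */
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Under the server, the call would be read_form_data(stdin, getenv("CONTENT_LENGTH")). In a test harness, you can pass a file of saved form data and its length instead, with no server or browser involved.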
Before trying to write your own CGI applications, consider letting
someone else do it for you. This section discusses several CGI
toolkits that are available on the Internet or the CD-ROM. Whether
you are just counting visitors at your site, tabulating more advanced
statistics, or running a customer support form, there is bound
to be something here that will help you make your Intranet come
to life.
CGI PerForm
CGI PerForm was designed to work with both Windows NT and Windows
95 and provide all the basic CGI functionality needed by a WWW
site, without requiring C or Perl. With a simple command file,
template file, and HTML form, you can create an e-mail feedback
form, guest book, or even a ballot box. You can also perform all three
of those operations at the same time, as many times as you want.
For more information about CGI PerForm, visit this URL:
http://www.rtis.com/nat/software/
How CGI PerForm Works
You can break down an interactive WWW page into three pieces:
- The HTML form through which the data is typed in and submitted
- The Common Gateway Interface (CGI) application that receives and processes the submitted data
- The end result
CGI PerForm is one example of the CGI application that handles
the incoming data and creates the result. A result can be a combination
of more than one task or command. PerForm commands are discussed
thoroughly in the online documentation that accompanies the product.
CGI PerForm uses a command file you create to determine what tasks
it needs to perform on the data. A different command file is created
for every interactive application needed. Each command requires
certain key values in order for it to perform its task. A majority
of the key values are filenames. Some of these files must already
exist, such as a template file or a column file. Others are created
by the command, such as a data file or the output file.
CGI PerForm takes all the incoming data supplied by the HTML form
and stores it in a memory block. An HTML form supplies data
in name=value pairs, for example, lastname=Smith.
You can supplement the data supplied by the HTML form by plugging
in hard-coded name=value pairs in the command file. These values go into the same memory
block as the submitted data. You can hard-code values in your
command file to hide them or to set defaults.
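If you are curious what handling name=value pairs looks like in C, here is a minimal sketch. It is not CGI PerForm's implementation, which is not shown in this book. The names parse_pairs, find_value, and struct pair are hypothetical, and URL decoding of '+' and %XX escapes is left out to keep the example short.

```c
#include <string.h>

struct pair {
    char *name;
    char *value;
};

/* Splits "a=1&b=2" in place: each '&' and the first '=' of each
   pair is overwritten with '\0'. Stores up to 'max' pairs and
   returns how many were found. Pairs without '=' are skipped. */
int parse_pairs(char *data, struct pair *pairs, int max)
{
    int count = 0;
    char *p = data;

    while (p != NULL && *p != '\0' && count < max) {
        char *next = strchr(p, '&');
        char *eq;

        if (next != NULL)
            *next++ = '\0';     /* terminate this pair */

        eq = strchr(p, '=');
        if (eq != NULL) {
            *eq = '\0';         /* split name from value */
            pairs[count].name = p;
            pairs[count].value = eq + 1;
            count++;
        }
        p = next;
    }
    return count;
}

/* Returns the submitted value for 'name', or 'fallback' if the
   form did not supply one (similar in spirit to a default). */
const char *find_value(const struct pair *pairs, int count,
                       const char *name, const char *fallback)
{
    int i;

    for (i = 0; i < count; i++) {
        if (strcmp(pairs[i].name, name) == 0)
            return pairs[i].value;
    }
    return fallback;
}
```

The fallback argument plays a role similar to a hard-coded default: if the form did not send a value for a name, the program supplies one of its own.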
The next step is to use the data. You can save the data to a data
file or a database or combine the results with a template file
to create a confirmation message or a form letter to be mailed.
The command can be performed as often as necessary with different
key values. For example, you could save data submitted by a user
into three different data files. These data files can share some
of the same data, or two of them could be identical.
You can also pass variables between command blocks to create unique
files in which to store data at the user's request.
CGI2Shell 2.0
If you find yourself using a lot of CGI scripts, you'll like this
little utility package from Richard Graessler of Germany. See
this link for more information (and other utilities):
http://rick.wzl.rwth-aachen.de/rickg/index.html
The CGI2Shell applications are intended for Windows Web servers
that do not support the execution of scripts without a corresponding
shell in the command line of a <FORM ACTION=> or <A HREF=> tag.
The CGI2Shell Gateway is a set of programs that enable PATH_INFO
to specify the name of a CGI script that will be executed with
either the POST or GET method. Currently, the shells Perl.exe
and Sh.exe, and the Windows NT command interpreter cmd.exe,
are supported.
Using CGI2Shell
The CGI2Shell Gateway includes three programs, one for each shell
it supports:
CGI2Sh.exe for Sh.exe
CGI2Perl.exe for Perl.exe
CGI2Cmd.exe for Cmd.exe (Windows NT only)
All you need to do is include the script with its path in the
PATH_INFO of the URL. For example:
http://host.domain/progpath/CGI2xxx.exe/scriptpath/script.ext
The shell programs must reside in a directory on the PATH or in the
same directory as CGI2xxx.exe.
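To see which part of such a URL ends up in PATH_INFO, consider the sketch below. It only illustrates the CGI convention that the path following the program name is passed along as extra path information. It is not code from CGI2Shell, and the function name extra_path is hypothetical.

```c
#include <string.h>

/* Illustration only: returns the part of 'url_path' that follows
   the gateway program's name, which is what a CGI server hands to
   the program as PATH_INFO. Returns NULL if 'gateway' does not
   appear in the path. The comparison here is case-sensitive. */
const char *extra_path(const char *url_path, const char *gateway)
{
    const char *hit = strstr(url_path, gateway);

    if (hit == NULL)
        return NULL;
    return hit + strlen(gateway);
}
```

For the path /progpath/CGI2Perl.exe/scriptpath/script.ext and the gateway name CGI2Perl.exe, the function returns /scriptpath/script.ext. A CGI program normally does not do this split itself. The server does it, and the program reads the result with getenv("PATH_INFO").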
CGIC
CGIC, written by Thomas Boutell, is a library of functions for CGI
development in ANSI standard C. You can find more information about
it at the following URL:
http://sunsite.unc.edu/boutell/cgic/cgic.html
EIT's CGI Library
Enterprise Integration Technologies (EIT) has created LIBCGI to assist
programmers who are writing CGI systems in the C language. The library
consists of about 15 functions, and it is freeware. Originally
written for UNIX as part of their Webmaster's Starter Kit, it
has been ported to several other popular platforms, including
Windows. As with several URLs mentioned in this book, I have not
tried this product and cannot endorse it other than to suggest
that you visit their site and have a look for yourself: http://wsk.eit.com.
Web Developers Warehouse
If you program with Borland C++ and don't mind paying for CGI
and HTML tools, you should definitely drop by http://htechno.com/wdw/index.htm.
A company called Specialized Technologies has developed a suite
of products they call the Web Developers Warehouse. It includes
three components: TCgi, HTML Objects, and Web Wizard. TCgi is
a set of C++ classes for WinCGI, which works with the O'Reilly
WebSite server and the FolkWeb server.
Visit their home page for more information. You can also try the
demonstration programs, pay for the software electronically, and
download it pronto.
The next chapter picks up where this one leaves off by showing
you how to develop a functional CGI database application. The
source code in HTML, Visual C++, and Visual Basic is ready to
run from the CD-ROM, but the material is not for the faint of
heart when it comes to programming. If you're ready to take a
big step beyond static Web pages, read on.

Copyright 1998
EarthWeb Inc., All rights reserved.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.